Inferences for current chronic graft-versus-host-disease free and relapse free survival

This paper provides the methodology for a new summary curve that measures the dynamic outcome following allogeneic hematopoietic cell transplantation. The curve gives, over time, the probability that a patient is alive, in remission and free of moderate-to-severe chronic graft-versus-host disease (GVHD). This probability is called Current chronic GVHD-free, Relapse-Free Survival (CGRFS). Based on a multistate model depicting the possible states that a patient may experience after transplant, CGRFS can be formulated as a linear combination of five survival functions. This method is known as the model-free approach. In this paper we provide the inferences of the model-free approach, including estimation of CGRFS, precision evaluation and comparison of CGRFS between two independent samples.


Introduction
Several survival curves are often depicted to show the outcomes after allogeneic hematopoietic cell transplantation. Among these curves, overall survival (OS) and disease-free survival (DFS) are most commonly used. Death is the endpoint for OS, and death or disease relapse is the endpoint for DFS. Aiming at depicting the probability of survival in remission and free of comorbidity, Holtan et al. [1] proposed the GVHD-free, Relapse-Free Survival (GRFS). For GRFS, any of death, relapse, grade 3-4 acute GVHD, or chronic GVHD requiring immunosuppression is viewed as a terminal failure event. GRFS is effective in measuring the short-term transplant outcome. In a study cohort of 907 patients, Holtan et al. reported the 1-year GRFS to be 31%. The Center for International Blood and Marrow Transplant Research (CIBMTR) suggested 23% as the reference 1-year GRFS for adult patients in trial studies. GRFS is very useful in trials evaluating different treatments since the target events can be observed within a relatively short time period.
GRFS is not suitable for assessing long-term post-transplant outcomes. First, acute GVHD occurs shortly after transplant, so it is more appropriate to consider acute GVHD as a brief transient state. Second, chronic GVHD resolves in a large proportion of patients, and there is no difference in quality of life between patients with resolved chronic GVHD and those who never experienced chronic GVHD. To better evaluate the long-term outcome, our group proposed the Current chronic GVHD-free, Relapse-Free Survival (CGRFS) function, defined as the probability that a patient is alive, in remission and free of moderate-to-severe chronic GVHD at time t after transplant [2]. In CGRFS chronic GVHD is not a terminal event. A subject is removed from the risk set at the onset of chronic GVHD. When the chronic GVHD is resolved, the subject is again included in the risk set. Composite endpoints such as DFS or GRFS are commonly used for summarizing outcomes in stem cell transplant studies. For a composite endpoint, only one health state is of study interest. CGRFS is not a composite endpoint since more than one health state is considered in CGRFS. The dynamic nature of CGRFS makes it similar to the current leukemia-free survival (CLFS) proposed by Klein, Szydlo et al. [3]. CLFS was advocated as a better summary curve for measuring the effectiveness of donor lymphocyte infusion (DLI) post transplant [4]. Post-transplant relapse was treated by DLI and many relapsed patients achieved a second remission. For CLFS, relapse in the initial remission is not a terminal event. Both the initial and the second remission states are considered in CLFS.
A multistate model should be constructed for a dynamic endpoint [5]. The survival associated with a dynamic endpoint involves the transition probabilities from the initial state to other states. The conventional method for estimating a transition probability is the product-limit estimator [6]. An alternative method, suggested by Pepe [7], estimates a transition probability by the difference of two survival functions. For CLFS, Klein, Szydlo et al. [3] employed the conventional product-limit method to estimate the transition probability. In another work, Klein, Keiding et al. [8], inspired by Pepe's idea, proposed the model-free approach, which formulates CLFS as a linear combination of three survival functions and uses the Kaplan-Meier estimators to estimate these survival functions. For CGRFS, both the conventional product-limit method and the model-free approach can be utilized. The model-free approach has a great advantage because it is much simpler in formulation and computation. In our clinical paper, we suggested the model-free approach and presented the estimation result of a real data set [2]. In recent years other dynamic endpoints have been advocated for stem cell transplant studies [9,10]. The current survival functions were defined for these dynamic endpoints based on specific multistate models, and the model-free approach was utilized to estimate the proposed current survival functions.
CGRFS has been recognized as a useful addition to the existing survival functions [11-13] and considered a suitable indicator of transplant success [14]. Our clinical paper did not include interval estimation, which is critical for assessing the precision of CGRFS estimates. The clinical paper also lacked details about two-sample comparison. In this paper we present the detailed inferences. In Section 2 we give the definition of CGRFS, together with point estimation, precision evaluation and two-sample comparison. In Section 3 we present the analytical results for the data of 422 patients, including the CGRFS curve, the estimated transition probabilities of different health states, and the two-sample comparison result. In addition, we explain how to compute the probabilities of all states, and suggest other practically meaningful functions. A discussion of CGRFS is given in Section 4.

Estimation of CGRFS
For CGRFS, onset and resolution of chronic GVHD need to be clearly determined. Onset of chronic GVHD was defined as moderate-to-severe chronic GVHD based on the NIH criteria [15] at the time of the most recent assessment. GVHD evaluation was prospectively performed by a single practitioner within the program. Resolution of chronic GVHD was determined if symptoms became quiescent and systemic immunosuppression was discontinued. We constructed a multistate model to depict the disease progression after transplantation. Two episodes of chronic GVHD were incorporated in the model (Fig. 1). CGRFS is the probability that one stays in state 0, 4 or 8 at time $t$. Let $P_{kl}(s, t)$ be the transition probability from state $k$ to state $l$ in the time interval $[s, t]$. CGRFS, denoted by $C(t)$, is defined as the sum of three transition probabilities,
$$C(t) = P_{00}(0, t) + P_{04}(0, t) + P_{08}(0, t).$$
An intensity matrix can be constructed based on the transition intensities of the transitions $0 \to 1, \ldots, 8 \to 9$ (Fig. 1). The conventional method of estimating $P_{00}(0, t)$, $P_{04}(0, t)$ and $P_{08}(0, t)$ is to consider the product integral of the intensity matrix [6].
The product-integral method is computationally expensive. The issue is most severe for $P_{08}(0, t)$ because a subject has to pass through three states before reaching state 8, alive with the second chronic GVHD resolved. Following Pepe [7] and Klein, Keiding et al. [8], we developed the model-free approach for CGRFS [2]. We formulated CGRFS as a linear combination of five survival functions, $S_1(t), \ldots, S_5(t)$, pertaining to five composite endpoints. These five composite endpoints are explained as follows. Let $T_k$ be the time to the $k$th composite endpoint and $S_k(t) = \Pr(T_k > t)$ ($k = 1, \ldots, 5$). Note that $S_1(t)$ coincides with $P_{00}(0, t)$. The composite endpoint for $T_2$ is the first occurrence of the second chronic GVHD or death/relapse. For $T_2$, the death/relapse event can only be death/relapse in remission, in the first chronic GVHD or in the resolved first chronic GVHD. A patient who has not experienced this composite endpoint stays in state 0, 2 or 4 (see Fig. 1). Using transition probabilities,
$$S_2(t) = P_{00}(0, t) + P_{02}(0, t) + P_{04}(0, t).$$
Similarly, the composite endpoint for $T_3$ is the first occurrence of resolution of the first chronic GVHD or death/relapse. Without experiencing this composite event, one stays in state 0 or 2. That is,
$$S_3(t) = P_{00}(0, t) + P_{02}(0, t),$$
so that $P_{04}(0, t) = S_2(t) - S_3(t)$. In addition, based on the composite endpoints for $T_4$ and $T_5$, $S_4(t)$ is the probability of staying in state 0, 2, 4, 6 or 8 while $S_5(t)$ is the probability of staying in state 0, 2, 4 or 6. We can get that $P_{08}(0, t) = S_4(t) - S_5(t)$. In summary, CGRFS is a linear combination of these five survival functions,
$$C(t) = S_1(t) + S_2(t) - S_3(t) + S_4(t) - S_5(t).$$
Among these survival functions, $S_1(t)$ and $S_4(t)$ are practically meaningful as they are the relapse-free, chronic GVHD-free survival and DFS, respectively. The other survival functions have no direct clinical interpretation but are introduced here to obtain the probability of being in a single state. All five survival functions can be estimated by the Kaplan-Meier method.
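As an illustration of the model-free approach, the following minimal sketch computes Kaplan-Meier estimates for the five composite endpoints and combines them with the signs implied by the state sets above (S2 - S3 isolates state 4 and S4 - S5 isolates state 8). The function names, the array layout (one column per composite endpoint) and the four-patient toy data are assumptions made for this example and are not taken from the study.

```python
import numpy as np

def kaplan_meier(time, event, grid):
    """Kaplan-Meier estimate of S(t) evaluated at each point of `grid`."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    surv = np.ones(len(grid))
    for j, t in enumerate(grid):
        s = 1.0
        for u in event_times[event_times <= t]:
            at_risk = np.sum(time >= u)            # under observation just before u
            failed = np.sum((time == u) & event)   # composite endpoints observed at u
            s *= 1.0 - failed / at_risk
        surv[j] = s
    return surv

# Signs of S_1, ..., S_5 in the linear combination for CGRFS.
COEF = np.array([1.0, 1.0, -1.0, 1.0, -1.0])

def cgrfs(times, events, grid):
    """times, events: arrays of shape (n, 5); column k-1 holds X_ik and delta_ik."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    grid = np.asarray(grid, dtype=float)
    surv = np.column_stack(
        [kaplan_meier(times[:, k], events[:, k], grid) for k in range(5)]
    )
    return surv @ COEF

# Hypothetical toy data (time units arbitrary), columns = endpoints 1..5.
toy_times = np.array([
    [2, 2, 2, 2, 2],   # A: dies in remission at 2
    [1, 6, 3, 6, 6],   # B: GVHD onset at 1, resolved at 3, censored at 6
    [2, 4, 4, 4, 4],   # C: GVHD onset at 2, dies with GVHD at 4
    [6, 6, 6, 6, 6],   # D: censored at 6 in the initial remission
], dtype=float)
toy_events = np.array([
    [1, 1, 1, 1, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
], dtype=bool)
toy_curve = cgrfs(toy_times, toy_events, [0.5, 2.5, 5.0])
```

In the toy data, at time 5 patient B (state 4) and patient D (state 0) are alive without chronic GVHD, so the estimate at that time equals 2/4, which matches a direct count because no patient is censored before time 5.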
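To introduce inferences for CGRFS, we use counting process notation. Let $Z$ be the censoring time. Suppose that data of $n$ patients are collected. The sample can be summarized as $(X_{ik}, \delta_{ik})$ ($i = 1, \ldots, n$; $k = 1, \ldots, 5$), where $X_{ik} = \min(T_{ik}, Z_i)$, $\delta_{ik} = I(T_{ik} \le Z_i)$ and $I(\cdot)$ is an indicator function which takes value 1 if the event happens, and 0 otherwise. $N_{ik}(t)$ indicates whether the $i$th patient experiences the $k$th composite endpoint at or prior to $t$, that is, $N_{ik}(t) = I(X_{ik} \le t, \delta_{ik} = 1)$, and let $N_k(t) = \sum_{i=1}^{n} N_{ik}(t)$. We also define the risk sets related to the five composite endpoints. Let $Y_{ik}(t) = I(X_{ik} \ge t)$ and $\bar{Y}_k(t) = \sum_{i=1}^{n} Y_{ik}(t)$. The hazard function of $T_k$ is defined as
$$\lambda_k(t) = \lim_{\Delta \to 0^{+}} \Delta^{-1} \Pr(t \le T_k < t + \Delta \mid T_k \ge t),$$
and the associated martingale is $M_{ik}(t) = N_{ik}(t) - \int_0^t Y_{ik}(u) \lambda_k(u)\,du$. Define $y_k(t) = \lim_{n \to \infty} n^{-1} \bar{Y}_k(t)$, $\forall k = 1, \ldots, 5$. Let $\hat{S}_k(t)$ be the Kaplan-Meier estimator of $S_k(t)$ and $\hat{C}(t) = \hat{S}_1(t) + \hat{S}_2(t) - \hat{S}_3(t) + \hat{S}_4(t) - \hat{S}_5(t)$. Based on the facts in Andersen et al. [6] and Pepe [7], for large samples,
$$\sqrt{n}\,\{\hat{C}(t) - C(t)\} \approx n^{-1/2} \sum_{i=1}^{n} \psi_i(t),$$
where
$$\psi_i(t) = -\sum_{k=1}^{5} c_k S_k(t) \int_0^t \frac{dM_{ik}(u)}{y_k(u)}, \qquad (c_1, \ldots, c_5) = (1, 1, -1, 1, -1).$$
Based on the martingale central limit theorem, $\forall t$, $\sqrt{n}\,\{\hat{C}(t) - C(t)\}$ converges to a mean-zero normal distribution with variance $\sigma^2(t)$. The martingales $M_{i1}(t), \ldots, M_{i5}(t)$ are not orthogonal because by definition the event counting processes may involve the same events. Consequently the covariance should be considered if one wishes to consider the martingale variation process. Here we alternatively consider a moment estimator of the variance,
$$\hat{\sigma}^2(t) = n^{-1} \sum_{i=1}^{n} \hat{\psi}_i^2(t),$$
and
$$\hat{\psi}_i(t) = -\sum_{k=1}^{5} c_k \hat{S}_k(t) \int_0^t \frac{n\,d\hat{M}_{ik}(u)}{\bar{Y}_k(u)}, \qquad d\hat{M}_{ik}(u) = dN_{ik}(u) - Y_{ik}(u)\,\frac{dN_k(u)}{\bar{Y}_k(u)}.$$
The above moment variance estimator is similar to the variance estimation method provided by Klein, Keiding et al. [8] for current LFS. A linear $(1 - \alpha)100\%$ confidence interval for CGRFS can be calculated by $\hat{C}(t) \pm n^{-1/2} z_{1-\alpha/2}\,\hat{\sigma}(t)$. The log-log transformation is routinely used for interval estimation of a survival probability. A $(1 - \alpha)100\%$ log-log transformed confidence interval for $C(t)$ is given by
$$\hat{C}(t)^{\exp\left\{\pm \frac{z_{1-\alpha/2}\,\hat{\sigma}(t)}{\sqrt{n}\,\hat{C}(t)\,\log \hat{C}(t)}\right\}}. \qquad (1)$$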
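For readers who want to see the two pointwise intervals side by side, the sketch below computes the linear interval and a log-log transformed interval from a point estimate, a variance estimate on the root-n scale and the sample size. The log-log form used here is the standard one for a survival probability (delta method applied to log(-log C)); its exact agreement with the paper's Eq. (1) is an assumption, as are the function name and the numeric inputs.

```python
import math
from statistics import NormalDist

def cgrfs_intervals(c_hat, sigma_hat, n, alpha=0.05):
    """Linear and log-log pointwise intervals for C(t).

    c_hat     : CGRFS estimate at time t, strictly between 0 and 1
    sigma_hat : estimated standard deviation of sqrt(n) * (c_hat - C)
    n         : sample size
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma_hat / math.sqrt(n)                 # standard error of c_hat
    linear = (c_hat - z * se, c_hat + z * se)
    # Delta method on log(-log C): its standard error is se / (C * |log C|).
    theta = math.exp(z * se / (c_hat * abs(math.log(c_hat))))
    loglog = (c_hat ** theta, c_hat ** (1.0 / theta))
    return linear, loglog

# Hypothetical inputs, chosen only to exercise the function.
example_linear, example_loglog = cgrfs_intervals(c_hat=0.45, sigma_hat=0.5, n=422)
```

Unlike the linear interval, the log-log interval always stays inside (0, 1), which is the usual reason for preferring it near the tails of the curve.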

A confidence band for CGRFS
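In survival analysis the simulation approach has been commonly used to find the confidence band of a survival function [16,17]. $\forall i, k$, let $G_{ik}(t)$ be a standard normal random variate. The martingale process $\int_0^t dM_{ik}(u)$ has the same distribution as $\int_0^t G_{ik}(u)\,dN_{ik}(u)$. Based on this knowledge, we consider the process $W(t)$,
$$W(t) = -n^{-1/2} \sum_{i=1}^{n} \sum_{k=1}^{5} c_k S_k(t) \int_0^t \frac{G_{ik}(u)\,dN_{ik}(u)}{y_k(u)}, \qquad (2)$$
where $(c_1, \ldots, c_5) = (1, 1, -1, 1, -1)$ are the coefficients of $S_1(t), \ldots, S_5(t)$ in $C(t)$. Let $Q(t) = \sqrt{n}\,\{\hat{C}(t) - C(t)\}$; then $Q(t)$ can be approximated by $W(t)$. To construct a band for $C(t)$ in an interval $[t_1, t_2]$, one needs to find the critical value $q_\alpha$ such that
$$\Pr\left\{\sup_{t \in [t_1, t_2]} \left|\frac{Q(t)}{\sigma(t)}\right| \le q_\alpha\right\} = 1 - \alpha.$$
To obtain a realized process, one can generate standard normal random variates for $G_{ik}(t)$, $\forall i, k$, plug in the Kaplan-Meier estimators for the survival probabilities, and use $\bar{Y}_k(t)/n$ to replace $y_k(t)$. Given $B$ realized processes, let $\hat{W}_b(t)$ denote the $b$th realized process. The critical value is obtained by finding the $(1 - \alpha)100$th percentile of the supremum values, which is given by
$$\hat{q}_\alpha = \inf\left\{q : B^{-1} \sum_{b=1}^{B} I\left(\sup_{t \in [t_1, t_2]} \left|\frac{\hat{W}_b(t)}{\hat{\sigma}(t)}\right| \le q\right) \ge 1 - \alpha\right\},$$
where $I(\cdot)$ is the indicator function. A confidence band for $C(t)$ is given by $\hat{C}(t) \pm n^{-1/2}\,\hat{q}_\alpha\,\hat{\sigma}(t)$, $\forall t \in [t_1, t_2]$.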
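The percentile step of the simulation approach can be sketched as follows. The code assumes that the B realized processes have already been generated and evaluated on a common time grid inside the chosen interval, and that each is standardized by the estimated standard deviation before the supremum is taken; the array layout and function names are hypothetical.

```python
import numpy as np

def band_critical_value(w_realized, sigma_hat, alpha=0.05):
    """Critical value for the band from simulated processes.

    w_realized : array (B, m), realized processes on a grid of m time points
    sigma_hat  : array (m,), estimated standard deviation on the same grid
    Returns the smallest observed supremum q such that at least
    (1 - alpha) * B of the supremum values are <= q.
    """
    w_realized = np.asarray(w_realized, dtype=float)
    sigma_hat = np.asarray(sigma_hat, dtype=float)
    sup_vals = np.sort(np.max(np.abs(w_realized / sigma_hat), axis=1))
    b = len(sup_vals)
    # Small tolerance guards against floating-point error in (1 - alpha) * B.
    rank = max(int(np.ceil((1 - alpha) * b - 1e-12)), 1)
    return sup_vals[rank - 1]

def confidence_band(c_hat, sigma_hat, n, q):
    """Band limits: c_hat -/+ q * sigma_hat / sqrt(n) on the time grid."""
    half_width = q * np.asarray(sigma_hat, dtype=float) / np.sqrt(n)
    c_hat = np.asarray(c_hat, dtype=float)
    return c_hat - half_width, c_hat + half_width
```

Because the critical value is taken over the supremum on the whole interval, the resulting band is wider than the pointwise interval at every time point.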

A confidence band for differences in CGRFS between two independent samples
The method described in Section 2.2 can be extended to construct a confidence band for differences in CGRFS between two independent samples. Such a band shows in what time range the CGRFS of the two groups differ. This type of band is related to the hypotheses $H_0: C_1(t) = C_2(t)$ for all $t \in [t_1, t_2]$ versus $H_1: C_1(t) \ne C_2(t)$ for some $t \in [t_1, t_2]$, where $C_i(t)$ is CGRFS for sample $i$. A supremum test for the hypotheses can be developed.
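Let $\hat{C}_1(t)$ and $\hat{C}_2(t)$ be the estimated CGRFS of samples 1 and 2, respectively. Let the processes described in Eq. (2) for samples 1 and 2 be denoted by $W^{(1)}(t)$ and $W^{(2)}(t)$, respectively. Under the null hypothesis, asymptotically $\sqrt{n}\,\{\hat{C}_1(t) - \hat{C}_2(t)\}$ has the same distribution as $W^{(1)}(t) - W^{(2)}(t)$. To obtain a standardized realized process, we need to estimate the standard error of this difference, denoted by $\widehat{SE}(t)$. Because the two samples are independent, it is built from $\hat{\sigma}_1^2(t)$ and $\hat{\sigma}_2^2(t)$, where $\hat{\sigma}_i^2(t)$ is the estimated variance for sample $i$. A standardized realized process is defined as $\hat{U}(t) = \{\hat{W}^{(1)}(t) - \hat{W}^{(2)}(t)\}/\widehat{SE}(t)$. We can generate $B$ realized processes $\hat{U}_1(t), \ldots, \hat{U}_B(t)$. The critical value for the band can be obtained by finding the $(1 - \alpha)100$th percentile among the $B$ supremum values,
$$\hat{q}_\alpha = \inf\left\{q : B^{-1} \sum_{b=1}^{B} I\left(\sup_{t \in [t_1, t_2]} |\hat{U}_b(t)| \le q\right) \ge 1 - \alpha\right\}.$$
For the supremum test, we evaluate the supremum for the sample data,
$$K = \sup_{t \in [t_1, t_2]} \left|\frac{\sqrt{n}\,\{\hat{C}_1(t) - \hat{C}_2(t)\}}{\widehat{SE}(t)}\right|.$$
The test p-value is the probability of observing $K$ or more extreme values in the sampling distribution of $\sup_{t \in [t_1, t_2]} |U(t)|$. The p-value can be obtained by finding the proportion of the supremum values higher than or equal to $K$,
$$p = B^{-1} \sum_{b=1}^{B} I\left(\sup_{t \in [t_1, t_2]} |\hat{U}_b(t)| \ge K\right).$$
In a supremum test, one should determine the time interval $[t_1, t_2]$ in which the survival functions of the two samples are compared. If the goal is to compare survival functions over the entire study period, one may set $t_1 = \max\{t^{(1)}_{\min}, t^{(2)}_{\min}\}$, where $t^{(1)}_{\min}$ and $t^{(2)}_{\min}$ are the smallest event times in samples 1 and 2, respectively. If a large number of subjects remain under study even after the largest event time, one may set $t_2$ to be the largest event time in the two samples.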
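The final step of the supremum test, turning simulated supremum values into a p-value, can be sketched in a few lines. The code assumes the standardized realized processes are already available on a time grid inside the comparison interval and treats the test as two-sided by taking absolute values; both the layout and the function names are assumptions for illustration.

```python
import numpy as np

def supremum_test_pvalue(u_realized, k_obs):
    """Proportion of simulated supremum values that are >= the observed K.

    u_realized : array (B, m), standardized realized processes on a time grid
    k_obs      : observed supremum statistic K
    """
    sup_vals = np.max(np.abs(np.asarray(u_realized, dtype=float)), axis=1)
    return float(np.mean(sup_vals >= k_obs))

def comparison_start(event_times_1, event_times_2):
    """Lower limit of the comparison interval: the larger of the two
    smallest event times, so that both curves have started to change."""
    return max(float(np.min(event_times_1)), float(np.min(event_times_2)))
```

For example, with four simulated processes whose supremum values are 3, 2, 0.5 and 4 and an observed statistic of 2.5, two of the four values are at least as large, so the p-value is 0.5.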

The real life example
The study cohort consisted of 422 patients receiving an allogeneic transplant at a single institution in 2010 to 2015. The median age was 44 years (range 18 -77). 56% of patients were male. Matched related, matched unrelated and haploidentical donors were used in 125, 165 and 132 patients, respectively. The majority had low or intermediate disease risk index (DRI) (N=291, 69%). About half had hematopoietic cell transplant comorbidity index (HCT-CI) 3 or higher (N=194, 46%). Among 264 survivors the median follow-up time was 36 months (range 11-78 months).
We depicted four survival curves in Fig. 2. The 1-year OS, DFS, conventional GRFS and CGRFS (Table 1) were 0.78 (95% CI 0.74-0.82), 0.68 (95% CI 0.64-0.72), 0.33 (95% CI 0.29-0.38) and 0.45 (95% CI 0.40-0.50), respectively. The 1-year GRFS of this cohort was comparable to the rate reported by Holtan et al. [1] based on a cohort of 907 patients. As shown in Fig. 2, the GRFS curve dropped rapidly within 1 year, decreased slowly between 1 and 3 years, and flattened after 3 years. Since the majority of GRFS events occurred within one year, this function is only suited to assessing the short-term outcome. At 1 year, the CGRFS estimate was about 0.12 higher than the GRFS estimate and the difference became greater afterwards. Two reasons explain the difference. First, the definition of CGRFS does not involve the acute GVHD event. Second, a good proportion of chronic GVHD conditions were resolved. At 3 years, the OS, DFS, GRFS and CGRFS estimates were 0.61 (95% CI 0.55-0.66), 0.54 (95% CI 0.49-0.59), 0.23 (95% CI 0.18-0.27) and 0.47 (95% CI 0.42-0.52), respectively. The CGRFS and DFS estimates were very different at 1 year but became similar at 3 years, indicating that a large proportion of chronic GVHD conditions were resolved.
The CGRFS curve together with its 95% confidence interval is depicted in Fig. 3. The log-log transformation was employed for interval estimation. The confidence interval was calculated by the formula given in Eq. (1). We also evaluated the confidence interval for DFS (not shown in the paper). The confidence interval for CGRFS is narrower than that of DFS within 1 year but the precision levels for CGRFS and DFS become comparable afterwards. Though we focused on the probabilities of being in states 0, 4 and 8 (alive free of chronic GVHD) in Fig. 1, it is not challenging to find the probabilities of other states. According to the composite endpoints and survival probabilities explained in Section 2.1, we can identify that $S_3(t) - S_1(t)$ and $S_5(t) - S_2(t)$ yield the probabilities of staying in state 2 and state 6, respectively. Note that states 2 and 6 relate to survival under the chronic GVHD condition. Regarding the probability of state 1, we can see that death/relapse in remission and onset of the first chronic GVHD are two competing risks. Therefore, the probability of state 1 is the cumulative incidence function in the competing risks context and can be estimated by the Aalen-Johansen estimator. The probabilities of the other death/relapse states can all be recognized as cumulative incidence functions under competing risks settings. In summary, for $i \in \{1, 3, 5, 7, 9\}$, the probability $P_{0i}(0, t)$ of being in state $i$ is the cumulative incidence of the transition from state $i - 1$ to state $i$. These functions can be estimated by the Aalen-Johansen estimator. In Fig. 4 we depicted the probabilities of being in states 1 to 8. We can see from this figure that very small proportions of patients stayed in states 5 and 7, indicating that patients with resolved chronic GVHD had a low chance of experiencing failure events. Note that no patient died after resolution of the second chronic GVHD. Therefore, zero percent of patients stayed in state 9.
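The bookkeeping for the alive states can be written out as a short sketch. It uses the state sets given for the five survival functions (S1 covers state 0, S3 covers states 0 and 2, S2 covers states 0, 2 and 4, S5 covers states 0, 2, 4 and 6, and S4 covers states 0, 2, 4, 6 and 8), so each alive state is a difference of two adjacent functions. The function name and the numeric inputs are hypothetical and do not come from the study data.

```python
def alive_state_probabilities(s1, s2, s3, s4, s5):
    """Probabilities of the five alive states at a fixed time t,
    computed from the five survival functions evaluated at t."""
    return {
        0: s1,        # initial remission
        2: s3 - s1,   # first chronic GVHD
        4: s2 - s3,   # first chronic GVHD resolved
        6: s5 - s2,   # second chronic GVHD
        8: s4 - s5,   # second chronic GVHD resolved
    }

# Hypothetical survival values at one time point.
probs = alive_state_probabilities(s1=0.40, s2=0.63, s3=0.52, s4=0.66, s5=0.65)
cgrfs_at_t = probs[0] + probs[4] + probs[8]   # alive and free of chronic GVHD
dfs_at_t = sum(probs.values())                # equals S4, i.e. DFS
```

Two checks follow directly from the definitions: the five probabilities sum to S4 (DFS), and the sum over states 0, 4 and 8 reproduces the linear combination S1 + S2 - S3 + S4 - S5.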
As shown in Fig. 4, 28% of patients had entered state 2 (first chronic GVHD) within one year. At 1 year, 23%, 3% and 2% of patients were staying in states 2, 3 (death/relapse in first chronic GVHD) and 4 (resolution of first chronic GVHD), respectively. These numbers indicate that a high proportion of patients developed chronic GVHD within one year of transplantation. In some patients the chronic GVHD condition resolved very soon, as 2% became GVHD free by 1 year, while 3% died or relapsed by 1 year. As time went by, relatively more patients had their GVHD condition resolved than experienced a failure event. At 2 years, 32% of patients had developed initial chronic GVHD and entered state 2. Among them, 7% died or relapsed while 13% had their chronic GVHD condition resolved and moved to state 4. Only 12% were alive with the GVHD condition. At 2 years, among patients with resolved GVHD, a small proportion (1.5%) experienced a second episode of chronic GVHD, while the majority (11.5%) remained relapse-free and GVHD-free.
Association of demographics and clinical characteristics with CGRFS was evaluated by the supremum test described in Section 2.3. We chose to conduct the supremum test in the time interval $[t^{*}, 4]$ years, where $t^{*}$ is the larger of the smallest event times of the two samples. Only a few CGRFS events occurred after 4 years. Therefore we truncated at 4 years to avoid the high variability at the tails of the CGRFS curves. We evaluated the following factors: age (<55, ≥55), gender, DRI (low/intermediate, high/very high), HCT-CI (0-2, ≥3), donor type, stem cell source (bone marrow, PBSC), diagnosis, conditioning intensity, CMV status, and year of transplantation (2010-2015). Based on the supremum test results, only DRI had a significant effect on CGRFS ($P < 0.01$). Compared with patients at high or very high risk by DRI, patients with low or intermediate risk had a significantly higher chance of staying leukemia free and chronic GVHD free (Fig. 5).
Other outcomes can be generated from the multistate model in Fig. 1. For example, Pepe [7] mentioned that the prevalence of chronic GVHD in leukemia-free patients provides a measure of quality of life. Based on the states in Fig. 1, this prevalence is given by
$$\frac{P_{02}(0, t) + P_{06}(0, t)}{P_{00}(0, t) + P_{02}(0, t) + P_{04}(0, t) + P_{06}(0, t) + P_{08}(0, t)}. \qquad (4)$$
As another example, it is interesting to examine whether the chance of failure would be higher after a chronic GVHD is resolved in a patient compared with staying in the initial remission state. To answer this question, and given that we are interested in recovery from the first chronic GVHD only, we can consider a function of the state probabilities restricted to the first chronic GVHD episode (Eq. 5). Based on our clarification of the probabilities of all states given in this section, it is straightforward to estimate the functions in Eqs. 4 and 5. The bootstrap method can be used for interval estimation.
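A bootstrap interval for such a derived function can be sketched as below. The sketch resamples patients with replacement and uses the percentile method, which is one common choice and an assumption here, since the text does not specify the bootstrap variant. The `prevalence` helper works on a simplified, fully observed indicator table at a fixed time and is purely illustrative; with censored data the statistic would instead re-estimate the state probabilities on each resample.

```python
import numpy as np

def bootstrap_percentile_ci(data, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval; rows of `data` are patients."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)     # resample patients with replacement
        stats[b] = statistic(data[idx])
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)

def prevalence(rows):
    """rows[:, 0] = 1 if alive and leukemia-free at time t,
    rows[:, 1] = 1 if additionally in chronic GVHD at time t."""
    return rows[:, 1].sum() / rows[:, 0].sum()

# Hypothetical fully observed toy table: 30 failures, 50 leukemia-free
# without chronic GVHD, 20 leukemia-free with chronic GVHD.
toy_rows = np.array([[0, 0]] * 30 + [[1, 0]] * 50 + [[1, 1]] * 20)
toy_ci = bootstrap_percentile_ci(toy_rows, prevalence, n_boot=500, seed=1)
```

Resampling whole patients keeps the dependence among the state indicators of the same patient intact, which is why the patient is used as the resampling unit.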

Discussion
Pepe [7] initially discussed the model-free approach for estimating the probability of chronic GVHD. CGRFS was introduced to accommodate a more general context including two episodes of chronic GVHD. It reflects both onset and resolution of the initial and the recurrent chronic GVHD. A CGRFS curve is a useful supplement to the conventional OS, DFS and GRFS curves. For DFS, a patient is alive without relapse but may still suffer from chronic GVHD. In contrast to DFS, CGRFS shows the probability of staying in a better health status and is a meaningful measure of good quality of life. In the example, we explained how to estimate the probabilities of all states. Based on these results, we are able to find the probability of survival with chronic GVHD (the sum of the probabilities of states 2 and 6). Suppose that we consider CGRFS as the perfect health condition and assign it the utility value 1. If a utility for survival with chronic GVHD is provided, the quality-adjusted lifetime can be evaluated. This quantity will be another tool for outcome assessment.
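One way to make the quality-adjusted lifetime concrete is sketched below, under assumptions that go beyond the text: the quantity is restricted to the time span of a grid, it is defined as the integral of CGRFS (utility 1) plus the utility-weighted probability of survival with chronic GVHD, and the integral is approximated by the trapezoidal rule. The function name and inputs are hypothetical.

```python
import numpy as np

def quality_adjusted_lifetime(grid, cgrfs, p_gvhd, utility_gvhd):
    """Restricted quality-adjusted lifetime over [grid[0], grid[-1]].

    grid         : increasing time points
    cgrfs        : CGRFS at the grid points (utility 1)
    p_gvhd       : probability of survival with chronic GVHD
                   (states 2 and 6) at the grid points
    utility_gvhd : utility assigned to survival with chronic GVHD, in [0, 1]
    """
    grid = np.asarray(grid, dtype=float)
    weighted = np.asarray(cgrfs, dtype=float) + utility_gvhd * np.asarray(p_gvhd, dtype=float)
    # Trapezoidal rule: an approximation, since the estimated curves are step functions.
    return float(np.sum(0.5 * (weighted[1:] + weighted[:-1]) * np.diff(grid)))
```

With a constant CGRFS of 0.5, a constant probability of 0.2 of surviving with chronic GVHD and a utility of 0.5, the restricted quality-adjusted lifetime over four years is (0.5 + 0.5 × 0.2) × 4 = 2.4 years.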
In this paper we presented the inferences for CGRFS. Using the model-free approach, CGRFS can be conveniently estimated by a linear combination of Kaplan-Meier estimators of five survival functions. Computation can be done by invoking the built-in functions in statistical software and performing basic data manipulation. More specifically, one can use a built-in function, such as the LIFETEST procedure in SAS, to compute Kaplan-Meier estimates of the relevant survival functions. The next step is to merge the survival probability estimates by time, and then evaluate the linear combination.
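The merge-by-time step can be sketched as follows, assuming the software output for each survival function is a list of jump times with the Kaplan-Meier value at each jump. Each curve is treated as a right-continuous step function, evaluated on the union of all jump times, and combined with the signs implied by the state sets of the five functions (S2 - S3 isolates state 4 and S4 - S5 isolates state 8). The function names and the toy curves are hypothetical.

```python
import numpy as np

def step_value(jump_times, surv_values, grid):
    """Evaluate a right-continuous Kaplan-Meier step function on `grid`.
    The value is 1 before the first jump time."""
    idx = np.searchsorted(jump_times, grid, side="right")
    padded = np.concatenate(([1.0], surv_values))
    return padded[idx]

def merge_and_combine(curves, coef=(1.0, 1.0, -1.0, 1.0, -1.0)):
    """curves: five (jump_times, surv_values) pairs for S_1, ..., S_5.
    Returns the merged time grid and the linear combination on that grid."""
    grid = np.unique(np.concatenate([np.asarray(t, dtype=float) for t, _ in curves]))
    total = np.zeros(len(grid))
    for c, (t, s) in zip(coef, curves):
        total += c * step_value(np.asarray(t, dtype=float), np.asarray(s, dtype=float), grid)
    return grid, total

# Hypothetical Kaplan-Meier output for five composite endpoints.
toy_curves = [
    ([1, 2], [0.75, 0.25]),           # S_1
    ([2, 4], [0.75, 0.50]),           # S_2
    ([2, 3, 4], [0.75, 0.50, 0.25]),  # S_3
    ([2, 4], [0.75, 0.50]),           # S_4
    ([2, 4], [0.75, 0.50]),           # S_5
]
toy_grid, toy_cgrfs = merge_and_combine(toy_curves)
```

Evaluating every curve on the union grid avoids mismatched rows when the five endpoints have different event times, which is the practical difficulty in the merge step.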
Suppose that there exists only one episode of chronic GVHD. Such a setting is described by states 0 to 5 only, which cover the states related to the first chronic GVHD, and CGRFS reduces to $P_{00}(0, t) + P_{04}(0, t) = S_1(t) + S_2(t) - S_3(t)$. The inferential methods provided in Section 2 can be simplified by removing the terms related to $S_4(t)$ and $S_5(t)$.
Relapse in CGRFS is treated as a terminal event. If relapse is instead considered a curable condition, only minor changes are needed in the underlying multistate model. First, death becomes the only terminal event in states 1, 3, 5, 7 and 9 (the states of the terminal event). Second, occurrence of either chronic GVHD or relapse triggers entrance to state 2 and state 6 (the states of the diseases). Third, a patient moves from state 2 to state 4, or from state 6 to state 8, only if the person is free of both chronic GVHD and relapse, where states 4 and 8 indicate disease resolution. The estimation methods discussed in the paper are still applicable to CGRFS based on such a multistate model.
For studies focusing on post-GVHD performance, e.g., a study to evaluate the efficacy of different treatments for chronic GVHD, the multistate model can be revised by removing states 0 and 1 (initial remission and death/relapse in remission). Under this reduced multistate model, the time origin becomes the onset of chronic GVHD. The sum of the probabilities of states 4 and 8 (survival without chronic GVHD) can be used as the function for outcome assessment. One can follow the methods presented in Section 2 to develop the relevant inferences.